fix: green the failing unit test suite#1296
Open
lLukDreams wants to merge 7 commits into
Open
Conversation
/root was deliberately removed from the directory blocklist in instance.ts (commits 20b79f7, 49c0d9e) because it is the root user's home directory (common in Docker/CI) and is already covered by the workspace-trust prompt. The directory-safety test was left asserting that /root rejects with "Access denied", which fails on the Linux CI runner. Align the test with the intended implementation.
The provider transform was refactored (rolling-tail/head cache breakpoints) after this test was written, so the expected Anthropic payload was stale: the ephemeral cache_control marker now sits on the last assistant tool_use and last user tool_result, not on the first user text block. Align the expectation with the current behavior.
A subtask that failed before the actor signaled onReady (e.g. the actor service is unavailable or the model is unresolvable) lost its metadata on the resulting error tool state, because metadata was only reported from onReady. Seed sessionId/model onto the tool part up-front so it survives a failure at any point during spawn.
Servers imported from .claude.json carry mcp_origins[name].type === 'claude' and are meant to connect lazily, but MCP init eagerly connected every configured server — so a Claude config auto-spawned every local MCP server on startup and the lifecycle test (expecting 'pending' until an explicit connect()) failed. Honor the origin marker: leave Claude servers pending.
…nRef tryStartCheckpointWriter read the Actor service only via the late-bound global spawnRef, which goes stale when a memoized Actor layer is torn down by instance disposal elsewhere in the process (observed as cross-test pollution: the writer skipped with 'Actor service unavailable' depending on test order). Prefer the Actor service from the live Effect context via Effect.serviceOption — always present under AppRuntime and robust to the global going stale — falling back to spawnRef. serviceOption does not add Actor to requirements, so the layer cycle that motivated spawnRef is intact.
…red DB The backfill suite inspects ALL FTS rows and only cleared the global DB in afterEach, so rows left by earlier files in the same shard leaked into its assertions (saw 11 stray part_ids). Clear the DB in beforeEach too.
Cancelling a workflow run with more agents than the concurrency ceiling (globalSem = config.workflow.maxConcurrentAgents ?? min(16, 2x cores)) could spawn orphans and intermittently hang. Two causes, two fixes: 1. Orphan spawns (runtime.ts): the promise-based agent semaphore fires queued callbacks as permits free. While reclaim cancels the started agents and frees their permits, queued siblings would acquire them and spawn NEW actors that escape the cancel sweep. Mark entry.cancelling before reclaim and bail at both spawn sites so a late-acquiring agent never starts. 2. Cancel hang (spawn.ts): Actor.cancel awaited each agent's runner cancel, which Fiber.interrupts the work fiber and awaits its full unwind. A fiber blocked on a hung LLM response stream never finishes unwinding, hanging the whole sweep (and runtime.cancel) indefinitely. Bound that interrupt with a grace period — the signal is still delivered and the registry status still records the cancellation (cancel's contract); we just stop awaiting a stuck unwind. The regression test now pins the ceiling BELOW the fan-out so some agents are always queued (deterministic on any host), exercising both paths. Verified 20/20 with forced queuing and green across all 4 CI shards.
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue for this PR
Closes #
Type of change
What does this PR do?
Repairs the unit suite. The failures were a mix:
Stale tests (code changed, assertion didn't):
/rootfrom the forbidden-dirs assertion — it was intentionally removed from the blocklist.cache_controlexpectation to the rolling-tail marker placement the provider transform now uses.Real bugs:
onReadylost its metadata; seedsessionId/modelup front so the error tool state keeps it..claude.jsonMCP servers were eagerly connected instead of lazypending— themcp_origins"claude" marker wasn't honored.spawnRef, which goes stale when a memoized Actor layer is torn down by instance disposal in an earlier test. Resolve from the live context viaEffect.serviceOption(optional, so no layer cycle).Actor.cancelawaited a fiber unwind that never finishes when stuck on a hung LLM stream. Bail queued spawns once cancelling, and bound the interrupt.Test isolation:
afterEach; rows from earlier shard files leaked in. Clear inbeforeEachtoo.How did you verify your code works?
Ran the full suite on Linux across all 4 shards (matching the sharded
ubuntu-latestCI) — green, 0 failures. The workflow cancel fix specifically: 20/20 with the concurrency ceiling pinned below the fan-out to force queuing.typecheckclean.Checklist